Python Job: Site Reliability Engineer

Job added on 2022-12-13

Company

Sensei Labs
Canada

Location

Remote Position
(From Everywhere/No Office Location)

Job type

Full-Time

Python Job Details

We are looking for a Site Reliability Engineer to join our team and develop software systems and automated solutions related to monitoring and alerting for our SaaS application.

You will work with our Platform & Engineering teams to ensure our application has strong monitoring, alerting, and observability so that we can deliver an excellent experience to our customers.

The ideal candidate will be passionate about the large opportunity that Sensei Labs presents. This person must thrive in a hyper-growth environment where priorities can shift quickly, and succeed in delivering high-quality solutions. If you're looking to solve challenging technical problems and create a great product for our customers, then this is the right role for you.

RESPONSIBILITIES

Build monitoring that alerts on symptoms rather than on outages
Develop monitoring, alerting, and dashboard systems that provide good visibility into the health and state of the system
Use Chaos Engineering principles to test what you build under real-world conditions
Distinguish between monitoring for resource provisioning and monitoring for adverse events needing mitigation in other ways
Administer production jobs
Understand debugging info
Roll back a bad software push
Block or rate-limit unwanted traffic
Bring up additional serving capacity
Work closely with internal partners and teams to ensure that we ship software that meets security, SLA, and performance requirements
Write, update, and use documentation, including runbooks and playbooks
Automate work, including infrastructure needs, testing, failover solutions, and failure mitigation
Debug complex problems across the entire stack and create solid solutions
Develop CI/CD processes to improve release cadence

REQUIREMENTS AND SKILLS

Proven work experience as a Site Reliability Engineer or in a similar role
Experience with monitoring and observability tools such as Datadog, Sensu, New Relic, or Nagios
Specific, demonstrable experience in developing monitoring and alerting systems
Experience debugging complex problems
Experience designing, building, and operating large-scale production systems
Knows Python, Java, Go, Rust, or a similar language
Understands networking and messaging, especially between services
Has hands-on experience using source control (Git, GitHub) and feature branching strategies
Has experience with a variety of databases
Experience with container tooling such as Docker or Kubernetes
Experience automating infrastructure, testing, and deployments, and the ability to explain the Infrastructure as Code paradigm
Experience with configuration management

ABOUT US:

At Sensei Labs, we're continuing to build an amazing, diverse team and an inclusive culture. Our competitive advantage is rooted in the unique perspectives and experiences of our team members. We encourage you to apply even if you don't have all the qualifications listed but want to bring new ideas and perspectives to augment our team.

We're committed to ensuring equal access to employment opportunities for all qualified candidates, including candidates of color, women, LGBTQ+ candidates, candidates with family caregiving responsibilities, Indigenous candidates, immigrant candidates, and differently abled candidates. If you require accommodation during the application or interview process, please let us know and we’ll work with you to ensure you have a positive experience.

Job Type: Permanent

Salary: $92,447.00-$103,956.00 per year

Benefits:

Paid time off
Vision care

Schedule:

8 hour shift

Experience:

DevOps: 1 year (preferred)

Work Location: Remote
